On-chip trainable hardware-based deep Q-networks approximating a backpropagation algorithm
Authors
Abstract
Reinforcement learning (RL) using deep Q-networks (DQNs) has shown performance beyond the human level in a number of complex problems. In addition, many studies have focused on bio-inspired hardware-based spiking neural networks (SNNs), given the capability of these technologies to realize both parallel operation and low power consumption. Here, we propose an on-chip training method for DQNs that is applicable to SNNs and approximates the conventional backpropagation (BP) algorithm. An evaluation based on two simple games shows that the proposed system achieves performance similar to that of a software-based system. The proposed method can minimize memory usage and reduce power consumption and area occupation. In particular, for simple problems, the dependency on replay memory can be significantly reduced, and high performance can be achieved without it. Furthermore, we investigate the effect of the nonlinearity characteristics and types of variation of non-ideal synaptic devices on the outcomes. In this work, thin-film transistor (TFT)-type flash cells are used as the synaptic devices. A simulation is also conducted for a fully connected network with non-leaky integrate-and-fire (I&F) neurons. The system shows strong immunity to device variations because of the adopted training scheme.
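To make the training flow described above concrete, below is a minimal, illustrative sketch (plain NumPy) of a one-step online Q-learning update on a tiny fully connected network, i.e., training without a replay memory, as the abstract highlights for simple problems. The layer sizes, the ReLU hidden layer, and the sign-only weight update are assumptions chosen for illustration; they are not the BP approximation scheme or the synaptic-device model used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny fully connected Q-network: state -> hidden -> Q-values per action.
# Sizes are arbitrary illustration values, not taken from the paper.
n_state, n_hidden, n_action = 4, 16, 2
W1 = rng.normal(0.0, 0.1, (n_hidden, n_state))
W2 = rng.normal(0.0, 0.1, (n_action, n_hidden))
gamma, lr = 0.99, 1e-2

def q_values(s):
    """Forward pass; a ReLU hidden layer stands in for spiking activations."""
    h = np.maximum(0.0, W1 @ s)
    return W2 @ h, h

def online_update(s, a, r, s_next, done):
    """One-step temporal-difference update without a replay buffer.

    The weight change keeps only the sign of the gradient, a common
    hardware-friendly simplification (assumed here, not the paper's rule).
    """
    global W1, W2
    q, h = q_values(s)
    q_next, _ = q_values(s_next)
    target = r + (0.0 if done else gamma * np.max(q_next))
    td_error = q[a] - target            # scalar error for the taken action

    # Backpropagate the error through the two layers.
    dq = np.zeros(n_action)
    dq[a] = td_error
    dW2 = np.outer(dq, h)
    dh = (W2.T @ dq) * (h > 0)
    dW1 = np.outer(dh, s)

    # Sign-only updates: gradient magnitude information is discarded.
    W2 -= lr * np.sign(dW2)
    W1 -= lr * np.sign(dW1)
    return td_error

# Usage: apply one fictitious transition (state, action, reward, next state).
s = rng.normal(size=n_state)
s_next = rng.normal(size=n_state)
print(online_update(s, a=1, r=1.0, s_next=s_next, done=False))
```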
Similar resources
Cost-aware Topology Customization of Mesh-based Networks-on-Chip
Nowadays, the growing demand for supporting multiple applications drives the use of multiple IPs on a single chip. In fact, finding a truly scalable communication architecture will be a critical concern. To this end, the Networks-on-Chip (NoC) paradigm has emerged as a promising solution to on-chip communication challenges within silicon-based electronics. Many of today's NoC architectures are based...
Deep Generative Stochastic Networks Trainable by Backprop
We introduce a novel training principle for probabilistic models that is an alternative to maximum likelihood. The proposed Generative Stochastic Networks (GSN) framework is based on learning the transition operator of a Markov chain whose stationary distribution estimates the data distribution. The transition distribution of the Markov chain is conditional on the previous state, generally invo...
Trainable hardware for dynamical computing using error backpropagation through physical media
Neural networks are currently implemented on digital Von Neumann machines, which do not fully leverage their intrinsic parallelism. We demonstrate how to use a novel class of reconfigurable dynamical systems for analogue information processing, mitigating this problem. Our generic hardware platform for dynamic, analogue computing consists of a reciprocal linear dynamical system with nonlinear f...
User-based Vehicle Route Guidance in Urban Networks Based on Intelligent Multi Agents Systems and the ANT-Q Algorithm
Guiding vehicles to their destination under dynamic traffic conditions is an important topic in the field of Intelligent Transportation Systems (ITS). Nowadays, many complex systems can be controlled by using multi agent systems. Adaptation with the current condition is an important feature of the agents. In this research, formulation of dynamic guidance for vehicles has been investigated based...
Training Deep Spiking Neural Networks Using Backpropagation
Deep spiking neural networks (SNNs) hold the potential for improving the latency and energy efficiency of deep neural networks through data-driven event-based computation. However, training such networks is difficult due to the non-differentiable nature of spike events. In this paper, we introduce a novel technique, which treats the membrane potentials of spiking neurons as differentiable signa...
Journal
Journal title: Neural Computing and Applications
Year: 2021
ISSN: 0941-0643, 1433-3058
DOI: https://doi.org/10.1007/s00521-021-05699-z